Building an OS as a distributed system
namespaces
- DNS is good for organizations and power users; what about individuals?
- but: the system should be usable in isolation too; local namespaces, mergeable up to the global namespace?
- content should be signed (with the owner's private key) as much as possible, and possibly encrypted (to the recipient's public key).
binary code
- platform-agnostic VM; making native code is up to the machine; possibly with global blockchain-like caches
- boot: a small bootstrap image with native code
- modulekernel, not a micro- or monolithic one. Modules are composed dynamically and are not tied to being a process or not.
- threads are an explicit concept, untied from processes and memory management.
state and storage:
- content-addressable as much as possible;
- can always be replicated and distributed
- a torrent-like sync protocol
- 1 source -> many replicas,
- globally consistent - how?
- can be partially synchronized between processes and machines
- $shared \oplus mutable$ (shared xor mutable) as much as possible; mandatory locking with timeouts where state has to be both;
- append-only when it makes sense.
- databases with locking and transactions otherwise.
- no stinking file systems by default; but: still compatible with storage filesystems and hierarchical namespaces
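A minimal sketch of the content-addressable and partially synchronized parts of the list above, assuming SHA-256 as the address function and an in-memory dict as the backing store. `ContentStore`, `missing_from` and `pull` are hypothetical names made up for illustration, not part of any existing system.

```python
import hashlib


class ContentStore:
    """In-memory content-addressed blob store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, so identical
        # content always lands at the same address (deduplication for free).
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # A replica does not have to be trusted: re-hash what it returned.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("blob does not match its address")
        return data

    def missing_from(self, other: "ContentStore") -> set:
        # Partial sync: addresses the other store has that this one lacks.
        return set(other._blobs) - set(self._blobs)

    def pull(self, other: "ContentStore") -> int:
        # Copy only the missing blobs, verifying each one on the way in.
        wanted = self.missing_from(other)
        for address in wanted:
            self._blobs[address] = other.get(address)
        return len(wanted)


source = ContentStore()
replica = ContentStore()
a = source.put(b"hello")
b = source.put(b"world")
replica.put(b"hello")          # the replica already holds one of the blobs
copied = replica.pull(source)  # only the missing blob is transferred
```

Because blobs are immutable and self-verifying, "1 source -> many replicas" reduces to set difference over addresses; the open question of global consistency only concerns the mutable names that point at these addresses.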
identities
- every entity (e.g. file, filesystem) carries its owner's signature over its hash, which proves ownership
- machines have their own key, ideally in a TPM => they sign the owner's key to prove they belong to that owner.
- PKI: can be based on HTTPS certs and ACME (Let's Encrypt). Federated identity providers?
- delegation of some rights to groups?
- binary code should be signed by the vendor's identity, and the party that provides it should be accountable for it; by default the user does not delegate full authority to the code they run. Some running code has dynamically managed authority, $A_{code} = policy(A_{user}, A_{vendor})$. POSIX manages user permissions reasonably well, but: acknowledge that vendor permission management is a pressing and unsolved issue. Good UX for end-user machines?
- move the admin responsibilities (e.g. user permissions) out of kernel into stateless p2p mechanisms as much as possible.
- Syncthing-like trust model + delegation by certificates?
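One possible reading of $A_{code} = policy(A_{user}, A_{vendor})$ from the list above, as a sketch: authority is modelled as a set of right names, and the policy is plain intersection, so code never gets more than the user holds or more than the vendor declared. The `Authority` type, the right names and the choice of intersection are all assumptions made for illustration; the notes do not fix a particular policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authority:
    """A set of named rights (hypothetical model of authority)."""
    rights: frozenset


def policy(user: Authority, vendor: Authority) -> Authority:
    # Assumed policy: running code gets only the rights that the user
    # actually holds AND that the vendor declared the code needs.
    return Authority(user.rights & vendor.rights)


# What the user is allowed to do on this machine.
user = Authority(frozenset({"net", "read:docs", "write:docs", "camera"}))
# What the vendor's signed manifest says the code needs.
vendor = Authority(frozenset({"net", "read:docs", "read:contacts"}))

code = policy(user, vendor)

# "Dynamically managed": the user later narrows their own grant,
# and the code's authority is recomputed from the new inputs.
narrowed_user = Authority(user.rights - {"net"})
code_after = policy(narrowed_user, vendor)
```

Here `code` ends up with `net` and `read:docs` only: `camera` is dropped because the vendor never asked for it, and `read:contacts` is dropped because the user does not hold it.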
interaction:
- file-descriptor-like channels globally
- no "transparent network" stupidity; but: if everything ends up looking like a network, then why not. Shared memory should be encouraged, but explicit.
- reified into messages that can be sent;
- rich schemas and IDL; identified by hashes, can be partially synced; machines find a "common language" to communicate and "teach" each other known schemas when needed.
- batching of remote client-server interaction, rich language of allowed server-side actions? Compress as much as possible into as few messages as possible.
- untrusted parties: creating a common "interaction space" encrypted by both (~mTLS)
- split kernel into isolated modules that can be linked into an image via JIT when needed. On-machine compilation or deferring it to explicitly trusted parties. Compiler and debugger are required parts of the system, even if they are remote ones?
- live migration of processes/content between owned nodes (like in Arcan FE)
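The "schemas identified by hashes" and "teach each other" ideas above can be sketched as follows, assuming a schema is a JSON-serializable dict with a `fields` list and its identifier is the SHA-256 of its canonical JSON form. `schema_id`, `Peer` and the schema layout are hypothetical; a real IDL would be much richer.

```python
import hashlib
import json


def schema_id(schema: dict) -> str:
    # Canonical JSON (sorted keys, no extra whitespace) so that equal
    # schemas hash to the same identifier on every machine.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Peer:
    """A machine that knows some schemas (hypothetical, for illustration)."""

    def __init__(self, name: str):
        self.name = name
        self.schemas = {}  # schema id -> schema

    def learn(self, schema: dict) -> str:
        sid = schema_id(schema)
        self.schemas[sid] = schema
        return sid

    def send(self, other: "Peer", sid: str, payload: dict):
        taught = False
        if sid not in other.schemas:
            # The receiver does not know this schema yet: teach it first.
            # The receiver recomputes the hash itself in learn().
            if other.learn(self.schemas[sid]) != sid:
                raise ValueError("schema does not match its identifier")
            taught = True
        return other.receive(sid, payload), taught

    def receive(self, sid: str, payload: dict) -> dict:
        schema = self.schemas[sid]
        missing = [f for f in schema["fields"] if f not in payload]
        if missing:
            raise ValueError(f"payload lacks fields: {missing}")
        # Keep only the fields the schema names.
        return {f: payload[f] for f in schema["fields"]}


alice, bob = Peer("alice"), Peer("bob")
sid = alice.learn({"name": "Greeting", "fields": ["text", "lang"]})
first, taught_first = alice.send(bob, sid, {"text": "hi", "lang": "en"})
second, taught_second = alice.send(bob, sid, {"text": "yo", "lang": "en", "x": 1})
```

The schema travels only on first contact; afterwards both sides refer to it by hash, which is also what makes partial sync of schemas possible.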
Composing distributed systems into an OS
References: